analyse compares a set of patches against each other to find their similarity quotient.

Successive runs of analyse can be optimised, since many of the comparisons will already have been done in an earlier run.

analyse caches the results of its analysis in an evaluation file. rate reads this result and generates the clusters. The evaluation result is essentially a 2D array defining the similarity between each pair of patches in a set of patches.
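A minimal sketch of what such an evaluation result could look like, written in Python for illustration only. The similarity function here is a hypothetical stand-in (Jaccard similarity over patch lines), not the metric analyse actually uses, and keying the result by (smaller_id, larger_id) assumes that similarity is symmetric, so only half of the 2D array needs storing.

```python
from itertools import combinations


def similarity(patch_a: str, patch_b: str) -> float:
    # Hypothetical placeholder metric: Jaccard similarity of the line sets.
    lines_a, lines_b = set(patch_a.splitlines()), set(patch_b.splitlines())
    union = lines_a | lines_b
    return len(lines_a & lines_b) / len(union) if union else 1.0


def analyse(patches: dict[int, str]) -> dict[tuple[int, int], float]:
    # One entry per unordered pair, keyed by (smaller_id, larger_id).
    return {
        (a, b): similarity(patches[a], patches[b])
        for a, b in combinations(sorted(patches), 2)
    }


# Illustrative patch ids and contents, not taken from any real data set.
patches = {1: "+foo\n+bar", 2: "+foo\n+baz", 3: "+qux"}
results = analyse(patches)
```

For n patches this holds n * (n - 1) / 2 entries, which is why reusing cached entries on later runs is worthwhile.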

Consider a set of existing patches A which have already been analysed. A new set of patches B arrives.

We want to analyse A U B. How do we do this?

We need to figure out the set of patches that haven't been analysed yet. How? Patches in patch-groups are bound to have evaluation results cached. Thus to_be_analysed = victims - patch_groups_patches, where victims is the full set of patches being analysed (A U B).

The comparisons still to be done are then: to_be_analysed x to_be_analysed + to_be_analysed x evaluation_results_patches = to_be_analysed x (to_be_analysed + evaluation_results_patches) = to_be_analysed x victims
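The set arithmetic above can be sketched in Python as follows. The function name pending_comparisons and the patch ids are hypothetical. The sketch also assumes that a patch is never compared with itself and that each unordered pair needs comparing only once, neither of which the notes above state explicitly.

```python
def pending_comparisons(victims, analysed):
    # Patches with no cached evaluation results yet.
    to_be_analysed = set(victims) - set(analysed)
    # to_be_analysed x victims, keeping each unordered pair once.
    pairs = set()
    for new in to_be_analysed:
        for other in victims:
            if new != other:  # assumption: skip self-comparison
                pairs.add((min(new, other), max(new, other)))
    return to_be_analysed, pairs


# Illustrative ids only.
victims = {1, 2, 3, 4, 5}   # A U B
analysed = {1, 2, 3}        # A, already cached
to_be_analysed, pairs = pending_comparisons(victims, analysed)
```

With these inputs, 7 comparisons remain out of the 10 a full run would need; the 3 pairs among patches 1, 2 and 3 are served from the cache.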

evaluation result -- rate --> clusters --> patch-groups

patch-groups 1 2 3 6 7 10 11

1,6,10,11 vs 1,6,10,11

new 12,13,14,15